Distributed Real-Time Fault Tolerance on a Virtualized Multi-Core System
Abstract
This paper presents several approaches to real-time fault tolerance that apply redundancy methods to multi-core systems. Hardware virtualization is used to create a distributed system on a chip in which the cores are isolated from one another except through explicit communication channels. With this architecture, redundant tasks that would typically run on separate processors can be consolidated onto a single multi-core processor while still maintaining high confidence in system reliability. A multi-core chip-level distributed system could therefore offer an alternative to, for example, traditional automotive systems, which typically use a controller area network (CAN) bus to interconnect multiple electronic control units. With memory as the explicit communication channel, new recovery techniques that require higher bandwidths and lower latencies than traditional networks provide become viable. In this work, we discuss several such techniques we are considering for our chip-level distributed system, Quest-V.
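As a rough illustration of what a redundancy method can look like, the sketch below applies majority voting to the outputs of replicated tasks. This is an assumption made for illustration only: the abstract does not say which redundancy or recovery techniques Quest-V uses, and the name `majority_vote` and the use of `None` for a replica that failed or missed its deadline are hypothetical.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def majority_vote(results: Sequence[Optional[Hashable]]) -> Optional[Hashable]:
    """Return the value reported by a strict majority of replicas, else None.

    Hypothetical helper: each entry is the output of one redundant task
    (for example, one per isolated core). None marks a replica that
    crashed or missed its deadline.
    """
    # Ignore replicas that produced no result.
    valid = [r for r in results if r is not None]
    if not valid:
        return None
    value, count = Counter(valid).most_common(1)[0]
    # Require agreement from more than half of ALL replicas,
    # not just of the ones that responded.
    return value if count > len(results) / 2 else None


if __name__ == "__main__":
    print(majority_vote([42, 42, 7]))     # one faulty replica is outvoted
    print(majority_vote([42, None, 42]))  # one silent replica is tolerated
    print(majority_vote([1, 2, 3]))       # no agreement, so no result
```

With three replicas, this scheme masks a single faulty or silent replica. When no strict majority exists, the caller would have to fall back to some other recovery action, which is outside the scope of this sketch.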
Similar resources
Replication and Resubmission Based Adaptive Decision for Fault Tolerance in Real Time Cloud Computing: A New Approach
Cloud computing, an adoptable technology, is the outcome of the evolution of on-demand services within the paradigm of large-scale distributed computing. With the rising demands and benefits of cloud computing infrastructure, society can take advantage of intensive computing services and the scalable, virtualized environment of cloud computing to carry out real-time tasks executed on a remote clou...
A Fault Observant Real-Time Embedded Design for Network-on-Chip Control Systems
Performance and time-to-market requirements cause many real-time designers to consider commercial off-the-shelf (COTS) components for real-time systems. Massive multi-core embedded processors with network-on-chip (NoC) designs to facilitate core-to-core communication are becoming common in COTS. These architectures benefit real-time scheduling, but they also pose predictability challenges. In this work, w...
Multi-Layer Fault Tolerance for Distributed Real-Time Systems
This thesis addresses issues in building fault-tolerant distributed real-time systems. Such systems are increasingly deployed in automotive and avionics applications. We focus on the design and validation of fault tolerance mechanisms. From the design viewpoint, we develop the notion of multi-layer fault tolerance. A fault-tolerant distributed system contains a set of mechanisms that provide er...
Fault Tolerance in Real Time Distributed System
In this paper we investigate the different fault tolerance techniques used in many real-time distributed systems. The main focus is on the types of faults that occur in the system, the fault detection techniques, and the recovery techniques used. A fault, whether caused by link failure, resource failure, or any other reason, must be tolerated for the system to work smoothly and accurately. Th...
Study and Simulation of a Distributed Real-time Fault-tolerance Web Monitoring System
The goal of this project is to study and simulate a distributed real-time fault-tolerance web monitoring system. The method of providing fault-tolerance is to schedule multiple copies of a task on different computer nodes in a distributed computing system. A fault-tolerant system automatically recovers from a specified number of failures. If the primary task cannot be completed due to a fault, ...
Publication date: 2013